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Abstract 

Soil organic carbon (SOC) plays a major role in the global carbon budget. It can act as a source or a sink of atmospheric 
carbon, thereby possibly influencing the course of climate change. Improving the tools that model the spatial distributions 
of SOC stocks at national scales is a priority, both for monitoring changes in SOC and as an input for global carbon 
cycles studies. In this paper, "we compare and evaluate two recent and promising modelling approaches. First, we 
considered several increasingly complex boosted regression trees (BRT), a convenient and efficient multiple regression 
model from the statistical learning field. Further, we considered a robust geostatistical approach coupled to the BRT 
models. Testing the different approaches was performed on the dataset from the French Soil Monitoring Network, 
with a consistent cross-validation procedure. We showed that when a limited number of predictors were included in 
the BRT model, the standalone BRT predictions were significantly improved by robust geostatistical modelling of the 
residuals. However, when data for several SOC drivers were included, the standalone BRT model predictions were not 
significantly improved by geostatistical modelling. Therefore, in this latter situation, the BRT predictions might be 
considered adequate without the need for geostatistical modelling, provided that i) care is exercised in model fitting 
and validating, and ii) the dataset does not allow for modelling of local spatial autocorrelations, as is the case for many 
national systematic sampling schemes. 


1. Introduction 


Soils are the second biggest carbon pool of the planet. 


containing about 1500 PgC (Batjes, 1996 Eswaran et al. 


1993 Post et al. 1982). As such, their behaviour as a 


greenhouse gas source and sink needs to be quantified, 
when facing climate change induced by increasing atmo¬ 


spheric greenhouse gases concentrations (Batjes, 1996 Lai 


2004). Quantifying temporal changes of this pool requires 


estimating its spatial distribution at different dates and at 
various scales, with the national scale being of particular 
importance for international negotiations. The reliability 
of such estimates depends upon suitable data in terms of 
organic carbon content and soil bulk density and on the 
methods used to upscale point data to comprehensive spa¬ 
tial estimates. These estimates may also be used for defin¬ 
ing the baseline state for soil organic carbon (SOC) change 


simulations (van Wesemael et al. 2010), or setting some 


of the parameters for models of SOC dynamics (Tornquist 


et al. 2009). 


Interestingly, there is quite a diversity regarding the na¬ 
ture of the models used for upscaling SOC point measure¬ 
ments to the national level. The validity of each method 
depends on the datasets and on the scale (defined by its 


grain or precision and extent. Turner et ah, 1989). The 


mapping approaches range from simple statistics or pedo- 
transfer rules, relating SOC contents or stocks to soil type 


(Yu et al., 2007) or soil type and land use (Tomlinson and 


Milne, 2006 Arrouays et al., 2001), to multivariate regres¬ 
sion model s ([Meersmans et ah , 2008, with multiple linear 
models and Yang et al. , 2008 with generalized linear mod¬ 
els or Suuster et ah, 2012[ with mixed models). Recent 
studies have used techniques adapted from the data min¬ 
ing and machine learning literature, with piecewise linear 


tree models ( 

Bui et al. 

2009 

) or multiple regression trees 

for regional studies ( 

Crimm et al. 

2008 

Lo Seen et al. 

2010 

Suuster et al.l 

2012 

). Among the studies consider- 


ing small extent ((50 kinQ, many have considered the use 
of geostatistics, some including SOC predictors via cokrig- 


ing (CK) or regression kriging (RK) ( 

Mabit and Bernard 

2010 

Don et al. 20071 |Rossi et al.l|2009 

Yun-Qiang et al. 

2009 

Spielvogel et al. 

2009). As the extent increases, the 


use of geostatistics becomes less common and despite the 
spatial dimension of such studies, few geostatistical ap¬ 
proaches for SOC mapping have been proposed for use 


at the national scale (but see 

Chaplot et al. 

2009 

Kerry 

et al. 

2012 

Rawlins et al., |2009 

)■ 
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SOC mapping for France has been performed, during 
the last decade, by using class specific SOC means (Ar- 


rouays et al. 

2001 

) or I 

2011 

Meersmans et al. 


2001) or regression models (Martin et al. 


2012). The most recently pro¬ 


posed models are still not able to fully satisfactorily pre¬ 
dict SOC stocks or contents on independent locations : 
reached 0.50 and 0.49 and root mean squared prediction 
errors (RMSPE) 2.27 kg/m^ and 1.45%, for Martin et al. 


(2011) on SOC stocks and for Meersmans et al. (2012) 
on SOC contents, respectively. Martin et al. (2011) ob¬ 


tained unbiased predictions (the bias was estimated to be 
-0.002 kg/m^ by cross-validation), which might ensure un¬ 
biased mapping of the stock at the national level. Never¬ 
theless, these R^ and RMSPE results showed that there is 
potentially room for improvement, especially if one is will¬ 
ing to use such models for regional assessments. Adding 
spatial autocorrelation terms in these models might be a 
way to improve their performance. 

Recently, new approaches have been proposed for cou¬ 
pling regression models, relating environmental factors to 
the studied property, with geostatistical models, represent¬ 
ing the spatial autocorrelation among the observations (see 
for example Marchant et al. 2010). Such methods were 


also designed to handle local anomalies {i.e. outliers). 
Nevertheless, these methods do not currently include some 
features that other statistical models, such as boosted re¬ 


gression trees (BRT) used by Martin et al. (2011), have 
(i.e. handling nonlinear relationships between qualitative 
and quantitative predictors and the independent variable, 
nonlinear interactions between the predictors, in an auto¬ 
mated manner). Both approaches share the robustness to 
the presence of outliers in the dataset. As they are tack¬ 
ling different problems, the spatial autocorrelation for the 
geostatistical approaches, and the modelling of the com¬ 
plex interactions between SOC stocks and their drivers for 
the regression methods, both might be considered as com¬ 
plementary. 

The aim of this paper is to combine these recent robust 
geostatistical approaches with the BRT models currently 
applied to map SOC stocks at the national scale for France. 
We apply the methods to a dataset of 2166 paired obser¬ 
vations of SOC and bulk densities from the French soil 
quality monitoring network (RMQS). We use this study 
to assess the modelling methods to determine i) how use¬ 
ful it is to combine BRT and geostatistical modelling, and 
ii) if any advantages are dependent on the number of an¬ 
cillary variables included as predictors in the BRT models. 
The aim is not specifically to study the relative importance 
of SOC stocks drivers for France (which has been done re- 


(Fig.[T]). The network is based on a 16 km x 16 km square 
grid. The sampling sites are located at the center of each 
grid cell, except when settling a homogeneous 20 m x 20 m 
sampling area is not possible at this specific location (be¬ 
cause of the soils being sealed or strongly disturbed by 
anthropogenic activities, for instance). In that case, an¬ 
other site is selected within 1 km from the center of the 
cell depending on soil availability for sampling (for more 


information, see Arrouays et al., 2002). Some of the 2166 


sites of our dataset were actually replicates of the regu¬ 
lar cells sites : some cells had two sites located in them, 
one close to the center of the cell as described above, and 
another one located at another position within the cell. 

At each site, 25 individual core samples were taken 
from the (0-30 cm) and the subsoil (30-50 cm) using a 
hand auger according to an unaligned sampling design 
within a 20 m X 20 m area. Individual samples were mixed 
to obtain a composite sample for each soil layer. In ad¬ 
dition to the composite sampling, a soil pit was dug 5 m 
from the south border of the 20 m x 20 m area, from which 
6 bulk density measurements were done, as described pre- 

From these data. 


viously (Martin et al. 2009). 


SOC 


stocks (kg/m^) were computed for the 0-30 cm soil layer : 


SOCstocksgo cm = ^p*BDiSOC,(l - r/J (1) 

i=l 

where n is the number of soil horizon present in the 0- 
30cm layer, BD^, rfi and SOCi the bulk density, percent¬ 
age of rock fragments (relative to the mass of soil) and the 
SOC concentration (percent) in these horizons, and Pi the 
width of the horizons to take into account to reach the 
30 cm. The horizons considered for such an analysis did 
not include the organic horizons (such as OH or OL). 

The SOC stocks, the dependent variable, is the only 
variable which was observed at site level. All other vari¬ 
ables, the covariates (or ancillary variables) were depicted 
using available maps covering the French territory. This 
allowed us to consider models for mapping SOC distribu¬ 
tions at the national scale, which relies on the exhaustive¬ 
ness of the ancillary information. These ancillary maps 
were thus sampled at RMQS sites locations in order to 
estimate climatic, pedological, land use and management 
related, and biological variables. 

The map of pH was derived from two sources. For 
the forest soils, the forest soils surface pH map (UMR- 


LERFOB and Ifn 2008) was used. For the other soils, 


the median pH per district from the national soil test- 


cently [Martin et al.|[MTT| [Meersmans et al.[|2012[), nor to ing database was used (BEAT, [Lemercier et al.[ [2006[ ). 


produce a new map of SOC stocks in France. 

2. Materials and methods 

2.1. Data 

Soil Organic Carbon Stocks were computed for 2166 
sites from the French soil quality monitoring network (RMQS) 
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Land use was estimated from from Corine Land Cover 
2006 database and further reclassed into an adapted IPCC 
land use classification (various crops, permanent grass¬ 
lands, woodlands, orchards and shrubby perennial crops, 
wetlands, others and vineyards) (UE-SOeS, 2006). Clay 


content was estimated from the 1:1 000 000 scale Euro¬ 
pean Soil Geographical Database (King et al. 1995). As 



































































each polygon (or soil unit) from the 1:1 000 000 scale Eu¬ 
ropean Soil Geographical map is linked to possibly several 
soil types (hence clay levels), we used in the models the 
clay levels of the 3 most important (in terms of surface) 
soil types within each soil unit associated to each RMQS 
site, namely clayl, clay2 and clayi, ranked according to 
the percentage of their occurrence. Surface percentages of 
these soil types were also included as predictors within the 
models (pci, pc2 and pc3). For instance, let us consider a 
given RMQS site i belonging to soil unit j of the soil map. 
The soil unit j may have two soil types associated to it (stl 
and st2) with the occurring probabilities of 70% and 30% 
and clay levels of 45% and 35% . For this site i, the values 
of the clayl, clay2 and clayS variables would be 45%, 35% 
and NA (not available) respectively and for pci, pc2 and 
pc3, 70% and 30% and NA respectively. Organic matter 
additions (oma), such as slurry and farmyard manure were 
estimated. We used manure application and animal excre¬ 
ment production departmental statistics (ADEME 20071. 
These statistics were combined with dry matter C concen¬ 
tration values, ( Meersmans et ah] 2012, 37.7 % for farm 
yard manure and 36.6 % for slurry,). 

Climatic data were monthly precipitation (mmmonth“^), 
potential evapotranspiration (PET, mmmonth”^), and tem¬ 
perature (°C) at each node of a 8 x 8 km^ grid, averaged for 
the 1992-2004 period. This climatic map was obtained by 
interpolating observational data using the SAFRAN model 
(Quintana-Segui et al. 20081. Again, for the modelling 
study presented here, climatic variables were estimated at 
each RMQS site by performing a spatial join between the 
RMQS grid and the climatic map. 

Agro-pedo-climatic variables were also derived from 
the primary soil, climate and land use data estimated 
at each RMQS site: we used the (a) temperature and 
(6) soil moisture mineralization modifiers, as modelled in 
the RothC model (Coleman et al.[ 1997[ Martin et al. 


[2011 ). The b variable was calculated by combining, for 
each RMQS site, rainfall and PET data obtained from the 
climatic grid, with site observation of land use and clay 
content. Since three possible clay contents were estimated 
for each site, the three corresponding estimates of the b 
variables were also included, when relevant, in our BRT 
models. Lastly, the Moderate Resolution Imaging Spec- 
troradiometer Net Primary Productivity (MODIS NPP, 
gCm^yr) was used to get NPP estimates at each of the 
RMQS sites, as in (Martin et al., 2011). 

The CIS processing was carried out using the GRASS 
CIS (GRASS Development Team, 2012) and further map¬ 
ping was carried out using Generic Mapping Tools software 
(Wessel and Smith, 1991). 
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Figure 1: SOC stocks (0-30cm) values on the French mon¬ 
itoring network, which were used in the present study. 
Areas from 1 to 7 represent various different areas that 
are mentioned later in the text. 1: south-west Brittany. 
2: part of Basse Normandie. 3: Alsace and part of Lor¬ 
raine. 4: part of French Alps. 5: Massif Central. 6: French 
Pyrenean mountain range. 7: part of Aquitaine. 


2.2. Statistical modelling 

2.2.1. Boosted Regression Trees (BRT) modelling 

Boosted regression trees belong to the Gradient Boost¬ 
ing Modelling (GBM) family. The objective is to estimate 
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the function F that maps the values of a set of predic¬ 
tor variables x = {xl,..,xp} onto the values of the out¬ 
put variable y, by minimizing a specified loss function L. 
This L function is applied at each iteration in order to fit 
so-called base learners. The final prediction of the BRT 
model is a linear combination of each base learner pre¬ 
diction. The constant weight associated with these base 
learner predictions is called the learning rate and is one 
of the important parameters of this boosting algorithm 
(Freund and Schapire 1996). This kind of algorithm is 
also referred to as a forward “stagewise” procedure. The 
base learners of BRT are classification and regression trees 
(Breiman et al., 19841. Furthermore, BRT uses a spe¬ 
cialized form (for regression trees) of Stochastic Gradient 
Boosting (Friedman 2001). The stochastic characteristic 
of the algorithm relies on the fact that only a subset of 
the dataset is used for fitting the base learner on a given 
iteration. The subset is produced in each iteration using 
a uniform random draw without replacement. Besides the 
learning rate, other parameters are important when ap¬ 
plying this kind of model. Two of them determines the 
characteristics of each base learner : the tree size (which 
gives the size of individual regression trees) and the mini¬ 
mum number of observations in the terminal leaves of the 
trees. Several options are available for deciding when to 
stop adding base leaners to the model. One of them, based 
on an internal cross-validation, was shown to be the most 
efficient one (Ridgeway 2006) for avoiding overfitting and 
was used for the present study. BRT was shown to have 
improved accuracy compared with simple regression trees, 
thanks to its stochastic gradient boosting procedure aimed 
at minimizing the risk of overfitting and improving its pre¬ 
dictive power (Lawrence et al., 20041. It can handle non¬ 
linear interactions among predictors and the dependent 
variable, quantitative and qualitative predictors and miss¬ 
ing data. Lastly, several tools are available for interpreting 
the behavior and characteristic of the resulting BRT mod¬ 
els, such as the variable importance index for assessing the 
contribution of the predictors and the partial dependence 
plots for assessing the relationships between predictors and 
the predicted variable (Martin et al. 2011). 


A thorough description of the method is given in Fried- 


man 


(2001) and a practical guide for using it in Elith et al. 


(2008). The BRT models were fitted and used for pre¬ 
diction using the “gbm” R (R Core Team 2013) package 
(Ridgeway 2006). 


2.2.2. Three BRT Models for SOC stocks 

Three models for predicting SOC stocks in the 0-30 cm 
layer were tested. The models, which we refer to as the 
LU, L and F models, have increasing levels of complex¬ 
ity (see below for their full description of these models). 
These three models were chosen as they represent cases 
where either very little or a lot of information on ancillary 
variables is known on sites where SOC stocks are to be 
predicted. Additionally, the first model (LU), with two 
covariates (landuse and clay content) commonly used for 


predicting SOC within the geostatistical framework. The 
second one (L, see below the full description) is indeed the 
Extra model presented in Martin et al.| (2011 ). The use of 
the most complex model (F) enabled us to include all the 
ancillary data available for France at the national level at 
the time of the present study. 

The predictors used for each model were: 


• LU: lu-ipcc (land use classification adapted from the 
IPCC guidelines, 2006), clayl. 

• L: luJpcc, clayl, clay2 and clay3, pci, pc2 and pc3, 
the clay and corresponding probability of occurrences 
at each RMQS site, {pet, mmmonth”^), monthly 
precipitation {rain, mmmonth“^), temperature {temp, 
°C), the two RothC mineralization modifiers, a and 
61, 62 and 63 and the net primary productivity npp 
(gCm-2yr-i). 

• F: same predictors as the model L with the addition 
of pH, oma {i.e. organic matter addition, slurry and 
farmyard manure) and pm, the parent material. 


The “gbm” R package requires the specification of sev¬ 
eral important parameters : the tree size, the learning 
rate, the minimum number of observations in the ter¬ 
minal leaves of the trees and the bag fraction. For our 
three models, the values for these parameters were set 
to (12,0.01,3,0.7). These values were chosen according 


to recommendations found in the literature (Elith et al. 
2008[ Ridgewa;^ 2006). 


2.2.3. Geostatistical models 

We further investigated whether a robust geostatistical 
method, similar to the one presented by Saby et al. (2011), 
could be used to represent errors and improve predictions 
from each of the BRT models. In their work, Saby et al 
(2011) divided the spatial variation of a soil property into 
fixed and random effects. The fixed effects were a different 
constant mean soil property for each of 12 parent mate¬ 
rial classes and the random effects described the spatially 
correlated residual soil property variation. In the present 
work, each of the three BRT models was used alone as pre¬ 
sented in the previous section and as a fixed effect within a 
robust geostatistical method. This combination of a BRT 
and geostatistical model can be summarised as: 


Z = H{X) + u (2) 

where Z = ln{Y) with Y being a length n vector of 
observations of the SOC stocks, X the matrix (n x q) con¬ 
taining values of the covariates (or predictors) at each ob¬ 
servation site and the H function representing the boosted 
regression tree model (fitted to the log-transformed data). 
We note that the log-transform was necessary for the geo¬ 
statistical approach due to skewness of the observed SOC 
stocks distribution. Thus the vector u of length n con¬ 
tains the residuals of the BRT model predictions of the 
log-transformed data, compared to the log transformed 
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response variable. In the conventional geostatistical ap¬ 
proach, these residuals are assumed to be a realization 


of a second order stationary random process (Webster and 


Oliver 2007). We applied a robust geostatistical approach, 


in which the spatial correlation of residuals was modelled 
using a Matern equation based on the Dowd robust esti¬ 


mator of the experimental variogram (Dowd, 19841. More¬ 


over, outlying observations were identihed and Winsorized 


using the algorithm proposed by Hawkins and Cressie (1984) 


Winsorizing is a method by which extreme values in the 
statistical data are limited to reduce the effect of possibly 
spurious outliers. Note that Winsorizing is not equivalent 
to simply excluding data. Rather, in a Winsorized proce¬ 
dure, the extreme values are replaced by a certain value 
predicted by a statistical model. This algorithm provided 
for each observed residual Ui an interval [U ~, ]. Ui is 

then identihed as being an outlier when Ui ^ \U ~, Uf'] and 
its value is replaced by the closest limit of the interval. As 


Lacarce et al. (2012), observations were conhrmed as be¬ 


ing outliers, and transformed, conditionally on a measure¬ 
ment error of SOC stocks of eT, with e = 0.112, recently 
estimated for the RMQS dataset (unpublished data) : 


- ifZn(y,(l + e))-i7(X,) <C/- 

iUn{Y,{l-e))-HO^i)>Ut 

i otherwise 


( 3 ) 


where ui* represents the resulting Winsorized data. 
One should note that the geostatistical modelling is per¬ 
formed on the log scale, but the measurement error is valid 
on the original scale, hence the terms on the left-hand side 
of the inequalities in equation These inequalities mean 
that the observed residuals may exceed the \U ~, ] in¬ 

terval limits, but not by more than the possible measure¬ 
ment error on the observed values. If they do, they are 
Winsorized to U~ or depending on the case. The U~ 
and values are defined so that the validity of the spa¬ 
tial term u in equation]^ is verihed, which, without Win¬ 
sorizing, was rarely the case in previous studies ( [Marchant 
et al. 2010 [Lacarce et al.[[2M2 |. This check is performed 
using a leave-one-out cross validation (LOOCV). When 
the covariance model is a valid representation of the spa¬ 
tial variation of the property (in our case the residuals), 
the distribution of the squared standardized prediction er¬ 
rors (noted 9) derived from the cross valid ation will be a 
with mean 0 = 1 and median 6 = 0.455 (Marchant et al. 


2010) for which confidence intervals may be determined. 


This LOOCV procedure aims solely at checking the valid¬ 
ity of the geostatistical model and should not be mistaken 
with the global validation framework presented in the next 
section, aimed at estimating the predictive performance of 
the models (BRT models and their spatial counterparts). 

As a result the variation of the soil property is decom¬ 
posed in a threefold model described by [Marchant et ah] 
(2010): 1) variation modelled by the BRT models, 2) spa¬ 
tially correlated variation represented by the random effect 


of the residuals of the BRT models and estimated by vari- 
ograms using Dowd’s estimator to which Matern equations 
were fitted, 3) variation due to circumscribed anomalies. 
Once the BRT and geostatistical models were fitted, the 
property was predicted at each unsampled (ie not used 
for fitting the models) location of the dataset by lognor¬ 
mal ordinary kriging. This method consists of predicting 
the residual for the log-transformed variable by ordinary 
kriging based on Winsorized data u* (equation]^, and 
back-transforming the predicted value to the original SOC 
stocks scale through: 

Y{xi) = expiHCKi) +Ui +var[ui]/2 - i’ixi)) (4) 

where Ui is the ordinary kriging prediction of u at a 
given prediction location a;^, var[ui\ is the associated krig¬ 
ing variance and tp the Lagrange multiplier; both the krig¬ 
ing variance and Lagrange multiplier are needed to yield 
unbiased estimates in case of lognormal ordinary kriging 
( [Webster and 01i-(^ 2007| . 


2.2.4- Validation procedure 

We thus considered six models: three models without 
a spatial term (the LU, L and F BRT models) and what 
is hereafter referred to as their spatial counterparts (the 
LUg, Lg and Fg models), i.e. the same three models with 
an additional spatial term (Eq. . These six models were 
validated using cross-validation. This validation procedure 
involves validation against independent data and enables 
estimation the predictive power of the proposed models. 

Comparison between observed and predicted values of 
SOC stocks was carried out on the original scale using 
several complementary indices, as is commonly suggested 
( [Schnebelen et al.' 2004): the mean prediction error (MPE, 
kg/m2), the root mean square prediction error (RMSPE, 
kg/m2) and the coefficient of determination (i?^) measur¬ 
ing the strength of the linear relationship between pre¬ 
dicted and observed values. Additionally, the ratio of per¬ 


formance to inter-quartile distance, RPIQ (Bellon-Maurel 


et al. 2010) was estimated as 


RPIQ = _J9v - 

^ RMSPE 


( 5 ) 


where IQg is the inter-quartile distance, calculated on ob¬ 
served SOC values from the whole dataset. RPIQ index 
accounts much better for the spread of the population 


than indexes such as RPD (Bellon-Maurel et al. 2010) 


and was used for comparing the prediction accuracy be¬ 
tween the six different models. Median prediction error 
and root of median of squared prediction errors were also 
calculated (hereafter named MedPE and RMedSPE re¬ 
spectively) . These additions to MPE and RMSPE respec¬ 
tively, provide a more complete picture of the errors in 
case of a skewed error distribution. 

The validation procedure was done using a Monte Carlo 


10-fold cross-validation (Xu and Liang, 2001), enabling 
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US to perforin what will be referred to in the following 
as external validation. It was preferred to simple data- 
splitting because the estimate of a model’s performance 
then does not rely on the choice of a single sub-sample. 
We preferred a k-fold procedure instead of a leave-one- 
out cross-validation as leave-one-out cross-validation re¬ 
sults in a high variance of the estimate of the prediction 


error (Hastie et al. 2001). Each step of the cross-validation 


procedure can be summarized as shown in algorithm and 
was repeated 200 times for each model. 

Algorithm 1 cross-validation repetition: 

1: Split the dataset into Learning (X, Y)^ and Validation 
(X,Y)v 

Compute Zi = ln{Y l) 

Fit the H BRT model and estimate 
Fit a variogram onUi = ZL — Z^ 
if 9 and 9 are not valid then 

Winsorize the dataset until valid 9 and 9 are ob¬ 
tained 
end if 

Estimate uy by ordinary kriging at Zy locations using 
the fitted variogram and the Winsorized residuals 
Calculate the lognormal kriging estimate, Yy using 
equ|^ 

Calculate Yy = exp(Lf(Xy)) 

Compute PERF on (Yy, Yy) 

Compute PERF on (Yy, Y^) 


9 : 
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Figure 2: Dowd variograms of the residuals of the LU, 
L and F models to estimate random effects of the LUg, 
Lg and Fg models. Residuals where calculated as the dif¬ 
ference between the log transformed response variable and 
the BRT model predictions. The variograms were obtained 
by fitting the Matern models on the full dataset. 


Steps 4 to 9 are performed as detailed in section 2.2.3 


More specifically, the spatial component of the spatial mod¬ 
els is validated at step 6 as presented in section |2.2.3| in 
order to make sure that these were valid representations 
of the residuals of the BRT models. This check was per¬ 
formed for each geostatistical model fitted during each rep¬ 
etition of the cross-validation procedure. 

Y represents the prediction provided by one given BRT 
model and Y® the prediction provided by its spatial coun¬ 
terpart model (the BRT and the geostatistical model). 
PERF indicates the computation of the performance met¬ 
rics (R2, RMSPE, RPIQ, MPE, MedPE, RMedSPE). We 
should note that the last step of the algorithm represents 
a true external validation of the spatial model because the 
model fitting is performed while masking the observations 
Yy used for validation, both during the variogram fitting 
(step 4) and the kriging procedure (step 8). A similar pro¬ 
cedure has recently been used and advocated by |Goovaerts| 
[and Keriy (2010), using leave-one-out cross-validation. It 


should be distinguished from other approaches where cross- 
validation embeds only the kriging, and not the htting 


of variograms parameters (e.g. 

Chunfaand et al. 

2009 

Mabit and Bernard 2010 

Xie et al. 

2011 

). In these cases. 


observations used for validation have already been used 
for fitting the variogram and the resulting model is not 
independent from these observations. We tested for dif¬ 
ferences between the performances of the six models in 


terms of each performance metric. The distributions of a 
performance metric were compared using a t-test with a 
Bonferroni adjustment. In the following, we use the terms 
MPE, RPIQ, RMSPE, R^, MedPE and RMedSPE names 
to refer to their mean value over the 200 repetitions of 
the cross-validation. The algorithm procedure was pro¬ 
grammed with R software using functionalities of geoR 


2008). 


and sp packages (Ribeiro and Diggle, 2001 Bivand et al. 


3. Results 

3.1. Variogram fitting on BRT residuals 

The degree of spatial correlation of residuals from BRT 
models depended on the complexity of BRT models (Fig[^ : 
the residuals resulting from the LUg model were spatially 
structured with a spatial dependence (defined as partial 
sill/(nugget -|- partial sill). Lark and Cullis 2004) of 0.34. 


Contrary to the residuals of the LUg model, the residuals 
of the more complex BRT models exhibited very limited 
spatial structure (Fig§. For the Fg model, the residuals 
had a spatial dependence of 0.057 and for the Lg model of 
0.1. Fig| indicated that from the simplest model (LUg) 
to the most complex one (Fg), the part of the spatial vari¬ 
ability not accounted for by the deterministic spatial trend 
decreased. 
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For the three models, Winsorizing was needed in order 
to produce valid models regarding the assumption on the 
modelled variable. The 9^, and 0^, values obtained after 
Winsorizing belonged to the confidence interval estimated 
for each model. The percentage of outliers ranged from 
1.9% to 2.8%. These sites present extremely low or high 
SOC stocks that cannot be modelled by the spatial term 
and the BRT models only. The number of such locations 
was halved between the LUg and the Fg models (Table 
as this latter model, the most complex one, was more able 
to model these extreme values. For this model, outliers 
appeared to be evenly distributed over the studied area 
(Figl^i). 

3.2. Cross-validation analysis & performance of the pro¬ 
posed models 

Cross-validation yielded valid spatial models for 100% 
of the cross-validation repetitions. Fig. shows that the 
F and L models and their spatial counterparts performed 
globally similarly to other and differently from the LU 
and LUg model. Average prediction performance of the 
models, expressed by the RPIQ index, ranged for our six 
models between 1.27 and 1.42. Increasing the complexity 
of BRT models resulted in improving the prediction per¬ 
formance and the best R^ value was obtained for the Fg 
model with a value of 0.36. 
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Predictions with the LUg model exhibited, on aver¬ 
age, limited bias (Figj^). Important differences appeared 
when comparing mean prediction errors or mean root squared 
errors to median prediction errors or median squared er¬ 
rors (Fig. l^-f). This indicated a skewed distribution of 
errors. When assessed by the median of the error distribu¬ 
tion (Fig. the geostatistical predictions are shown to 
have a positive median-bias, whereas the standalone BRT 
predictions have median-bias close to zero. Similarly, the 
skewness of the distribution resulted in considerably larger 
root mean squared errors (Fig. il) , with a lowest value of 
2.83 kg/m^ compared to root median squared errors (Fig. 
1^) with a lowest value of 1.43 kg/m^. 


3.3. Performance comparisons of BRT models with or with¬ 
out spatial component 

For our French dataset, adding a spatial term to the 
models resulted in improvements in terms of R2, RPIQ 
and the mean measures of prediction, RMSPE and MPE. 
These improvements were not significant for the L/Lg and 
F/Fg model comparisons, for the R2, RPIQ and RMSPE. 
However, in terms of the median measures, RMedSPE and 
MedPE, the standalone BRT predictions generally gave 
the better results. 

The improvement resulting from the addition of the 
spatial component was a decreasing function of the com¬ 
plexity of the BRT model. This is shown clearly on Fig|^, 
b, d and f. The R^ for the LU model was improved from 


e: MedPE (kq/mg) f: RMedSPE (kq/mg ) 



BRT models 


Figure 3: Performance of the six different models assessed 
using the 6 performance indices. On each diagram, the val¬ 
ues on the x-axis correspond to the aspatial models (BRT 
only): the LU, L and F models. Values on the y-axis corre¬ 
spond to the LU, L and F models plus a spatial term, i.e. 
the LUg, Lg and Fg models. Horizontal and vertical bars 
represent the 95% confidence intervals around mean values 
over the cross validation repetitions, for the BRT models 
only and the BRT with a spatial term models, respectively. 
The dotted lines correspond to the y = x function and for 
the c and e diagrams the y = 0 and the x = 0 lines were 
added. 
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Figure 4: Average prediction error (predicted minus observed values) on each RMQS site over cross-validation repetitions 
where it was considered as independent data, with the LU, LUg, L, Lg, F and Fg models on maps a, b, d, e, g and h 
respectively. Positive values indicate a positive bias and vice versa. Improvements from LU to LUg, L to Lg and F to Fg 
models are given on maps c, f, and i respectively. For instance, map c gives the absolute error of the LU model minus 
the absolute error of the LUg model. Positive values indicate that adding a spatial component improved predictions at 
this location. Size of the dots is an increasing function of absolute errors (or absolute improvement for maps c, f and i). 
Crosses are outliers of spatial models fitted on the whole dataset. 
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Table 1: Fitted variogram parameters in transformed units and cross-validation statistics. Matern parameters: Co is the 
nugget variance, Ci is the partial sill variance, is a spatial parameter expressed in km and k is a smoothness parameter. 
6 and 9 are the validation statistics before Winsorizing and 0^, and 9yj after Winsorizing (for the three models within 
the 95% confidence interval). N is the number of plots Winsorized and c is the Winsorizing constant. 



Co 

Cl 


K 

B 

9 

N 

c 

9w 

9w 

LUg 

0.112 

0.059 

95.99 

0.40 

1.18 

0.46 

61 

2.18 

1.000 

0.445 

Lg 

0.086 

0.010 

11.99 

10.00 

1.15 

0.46 

46 

2.28 

1.000 

0.452 


0.082 

0.005 

16.18 

10.00 

1.12 

0.44 

42 

2.31 

1.000 

0.433 


0.17 to 0.28 when adding a spatial term. The map of errors 
(Figj^) reveals regions where the LU model exhibited a 
strong negative bias, such as south west Brittany (area 1; 
for reference of area numbers, see Fig. [^, and mountain¬ 
ous areas such as the Massif Central (area 5), Alps (area 
4), and Vosges on the eastern part of the French territory 
(area 3). In other regions, it exhibited a positive bias, such 
as some of the parts of the south-west of France. The map 
of improvement between the LU and LUg models (map of 
differences, for each RMQS site, between the absolute er¬ 
rors of prediction with one BRT model and the absolute 
error with its spatial counterpart, Fig|^ for the LU/LUg 
models) shows areas with a dramatic improvement of pre¬ 
dictions, and more specifically where the BRT predictions 
were strongly biased. It should be noted that the strongly 
biased predictions almost disappeared with the most com¬ 
plex model (model F, Fig[^). Some under-estimations 
remained, although much smaller than for the LU model, 
in western coastal areas. 

Measured using the R^ index, the improvement yielded 
by adding a spatial component to the F model was not 
significant, with R^ values going from 0.35 to 0.36. No¬ 
ticeably, the root of the median squared prediction er¬ 
rors exhibited a limited but significant degradation (from 
1.35 to 1.43 for the F and the Fg models, respectively). 
The spatial distribution of improvements (Fig|^) for this 
model was clear for the south west Britanny region. In 
many other areas the improvement was even more limited 
with some sites where prediction was improved and oth¬ 
ers where prediction was degraded (Figj^). These were 
areas with high absolute errors {e.g. the Massif Central 
Fig|§). Interestingly, there was no significant difference 
in the performance of the for Fg and Lg models. This 
result indicated that adding a spatial component to the 
intermediate BRT model yielded similar results to adding 
a spatial component to the most complex model. 

4. Discussion 

4-1. Spatial dependence of SOC stocks 

The spatial dependence of the BRT residuals decreased 
as the complexity of the BRT models was increased. The 
variogram parameters provide some information about SOC 
controlling factors not included in the BRT model. For in¬ 
stance, when land use and clay content is included (in the 
LU model), the correlation range of model residuals lies 



LU 

LUg 

L 

Lg 

F 

Fg 

RPIQ 

1.27 

1.38 

1.42 

1.46 

1.42 

1.45 

R2 

0.17 

0.28 

0.33 

0.35 

0.34 

0.35 

RMSPE 

3.25 

2.97 

2.90 

2.81 

2.89 

2.83 

RMedSPE 

1.61 

1.59 

1.40 

1.45 

1.35 

1.43 

MPE 

-0.60 

-0.02 

-0.49 

-0.16 

-0.48 

-0.19 

MedPE 

0.09 

0.51 

0.06 

0.36 

0.07 

0.34 


Table 2: Performance of the six different models assessed 
using the 6 performance indices. Values given are the mean 
index values over the 200 repetitions of the cross-validation 
procedure. All values but for the R^ and RPIQ indices are 
given in kg/m^. 


between 300 and 400 km (Figj^. This gives an indication 
of the correlation range of the most important SOC con¬ 
trolling factors missing from the LU model. Hence, when 
attempting to improve the LU model of SOC stocks spatial 
dependence, one should look for controlling factors whose 
correlation range is less than 300 to 400 km. The L model 
included other controlling factors, such as clay content, 
which decreased both the total variance of residuals and 
their correlation range, to around 100 to 200 km. Lastly, 
the F model handled most of the spatial dependence by 
including three more drivers, the pH, the parent material 
and the regional statistics regarding organic matter addi¬ 
tions. However, the high nugget in the variograms from 
the residuals of each BRT model, including the F model, 
indicated that other controlling factors greatly influence 
SOC spatial distributions at ranges below the resolution 
of our dataset, i.e. 16 km. This is consistent with the re¬ 
sults of many other studies. For instance, in |Ungar et al.| 
(2010), the residuals of a model of SOC (%) taking into 
account administrative zonation and soil functional types 
were analysed by variography. They also found that most 
of the spatially structured variance was accounted for by 
a short range component (in their case 1500-2000 m). An¬ 
other possible explanation for the high nugget could be 
that the uncertainty attached to most of the covariates 
(drivers) maps is high, especially for the covariates derived 
from the 1:1000 000 soil map. 


4-2. Assessing the performance of one single model 

It is difficult to draw conclusions regarding the perfor¬ 
mance of the present models compared to those of other 
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studies dealing with SOC prediction and mapping. Some 
deal with SOC contents when other deal with SOC den¬ 
sities or stocks. When working on SOC stocks, the bulk 
densities are required, and if these are estimated (rather 
than measured), then the methodology for estimating bulk 


density might have great consequences (Liebens and Van 
[Molle 2003). Many studies use pedotransfer functions 


(PTFs) for estimating bulk density without accounting for 


the associated errors (Schrumpf et al. 2011). Ungar et al. 


(2010) estimated through a Monte Carlo approach that 


uncertainty resulting from their PTF ranged between 0.55 
and 7.72 T ha“^ depending on the SOC content. Schrumpf 


et al. (2011) showed that the use of PTFs for estimating 


bulk densities can lead to wrong or biased estimates of 
SOC stocks. However it is currently not entirely clear 
to what extent measuring bulk densities is worth, con¬ 
sidering the cost. This cost could alternatively be used 
to collect further SOC concentration data and thus im¬ 
prove calibration and validation datasets. Comparison 
between the studies is also made more complex by the 
differences between validation procedures (validation with 
an independent dataset, k-fold cross-validation, leave-one- 


out cross-validation). Furthermore, as quoted by Minasny 
et al. ( 2013| ) and Grunwald (2009), it is quite common 


that validation of SOC model predictions is missing en¬ 
tirely from a study. The best models presented in our 
study (the F and F^ models, see Figj^ performed compa¬ 
rably to those of Lo Seen et al. (2010), htted on soil data 


from the Western Ghats biodiversity hotspot (India). The 
models yielded, using a cross-validation scheme similar to 
the one applied here, RMSPE of 2.6 kg m“^ and of 
0.45, to be compared to the RMSPE of 2.83 kg m“^ and 
R^ of 0.36 obtained here for the Fg model, along with a 
MPE value of -0.19 kg m“^. Considering national SOC 


prediction, Phachomphon et al. (2010) produced 0-100 cm 


estimates using inverse distance weighting with 12 neigh¬ 
bours and ordinary cokriging, yielding MPEs of -0.2 and 
-0.1 kg m~^ and a RMSPE of 2.2 and 2.1 kg m“^, re¬ 


spectively. Mishra et al. (2009) produced estimates, for 
the Indiana state (USA) with a MPE of -0.59 kg m~^ 
and RMSPE of 2.89 kg m“^. This study involved the 
fitting of SOC depth distributions, as recently proposed 
by Kempen et al. ( |2011[ ). This last study, is among the 
most successful, with a rigorous validation scheme and an 
moderate extent (125 km^), giving an R^ of 0.75 for pre¬ 
diction on independent locations. Other studies on areas 
of the same order of magnitude as the area of the French 
territory {i.e. 000 km^) are r eferenced in the compre¬ 

hensive review of Minasny et al. (2013). The R^ values of 


our models are towards the lower end of R^ values in stud¬ 
ies mentioned in this review. Their performance is also 
remarkably lower compared to similar models presented in 


Martin et al. (2011) and Meersmans et al. (2012). This 


drop in model performance is likely a result of the uncer¬ 
tainty of the clay content estimated from the 1:1000 000 
soil map (as compared to the measured clay contents used 
in the previous studies). This is indicated by the impor¬ 


tance (quantified using the BRT variable importance in¬ 
dex) of clay related variables in the L and F models of the 
present study. These variables ranked at best 7*^ and 7*^ 
in the L and F models, respectively. This is to be com¬ 
pared to the first rank obtained by the clay variable in 


the Extra model presented in Martin et al. (2011), fitted 


and validated with measured clay contents. Thus, for the 
two previous studies for the French territory, the model 
performance for mapping might have been overestimated 
because some variables used for validation were observed 
at site level. In the present study, models are validated us¬ 
ing data estimated from ancillary maps, providing a more 
realistic assessment of model performance for mapping. 
Such small differences in the model validation schemes are 
difficult to trace and might further complicate comparison 
between different studies. 

4-3. Distribution of predictions errors 

Another issue worth commenting on here is the distri¬ 
bution of SOC stocks predictions errors. BRT modelling 
of log transformed SOC stocks gave residuals that were 
close to normal, with outliers. These residuals were mod¬ 
elled using a robust geostatistical approach, and a back 
transformation proposed for log-normal ordinary kriging 
was applied. The final predictions exhibited a limited bias 
(MPE=-0.19 kg m“^ for the Fg model, Tablej^, a problem 
that can arise in lognormal kriging due to the sensitivity 
of the back-transform to the variogram parameters and to 


the assumption of a lognormal distribution (Webster and 


Oliver 2007). Although we currently have no ready solu¬ 


tion for providing unbiased predictions, especially for the 
Lg and Fg models, we note that the MPE is small in com¬ 
parison to the RMSPE (less than 5 % of the RMSPE), 
which compares favourably with results of other studies 
reported above. Without the spatial component, the BRT 
predictions (back-transformed with a simple exponential, 
see Algorithm 1), showed negative mean-bias (i.e. under¬ 
prediction on average). This is logical because the BRT 
method ensures unbiased predictions on its predicted vari¬ 
able, here the log transformed SOC stocks; therefore, back- 
transformation of the BRT predictions through the expo¬ 
nential function results in a negative mean-bias for SOC 
stocks on the original scale. 

Further insight is provided by examining the behaviour 
of other performance indices, such as the median predic¬ 
tion error or the median squared prediction error. The log¬ 
normal kriging back-transformation aims to provide mean- 
unbiased predictions on the original scale, hence the rea¬ 
sonably small MPE. However, with a skewed distribution 
of errors, the predictions cannot also be made to be median- 
unbiased, hence the MedPE of the geostatistical predic¬ 
tions is positive. Without the geostatistical component, 
the back-transform of the BRT predictions (through the 
exponential function) preserves the median-unbiased prop¬ 
erty, giving low values of MedPE, but introduces mean- 
bias. Comparisons between the results of our BRT predic¬ 
tions and their spatial counterparts should be made with 
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this in mind; the differences could be at least partly due to 
the different objectives of the back-transformed predictors. 
Since SOC distributions are most commonly as log-normal, 
prediction error distributions are also skewed, and perhaps 
these further measures (MedPE and RMedPE, which are 
robust to extreme prediction errors at a small number of 
locations) can add useful information about model perfor¬ 
mance. 

We note here that the BRT approach could be applied 
to model the SOC stocks directly, without the need for 


|1983[ linear mixed models |Lark and Cullis[ 2004 or more 
generally scorpan-kriging models McBratney et al. 2003). 


any transformation (as shown by Martin et al., 20111. We 
would expect the resulting predictions to have low MPE, 
but a positive MedPE (a similar pattern to the results of 
the geostatistical approach). We tested this direct BRT 
modelling approach with the F BRT model; mean pre¬ 
diction errors were improved from -0.48 to 0.01 kg m“^, 
whilst median prediction errors were increased from 0.07 
to 0.45 kg m“^. In terms of squared errors, the RMSPE 
improved slightly from 2.89 to 2.82 kg m“^, whilst the 
RMedSPE increased from 1.35 to 1.5 kg m“^. In this work, 
we applied BRT modelling to the log-transformed data so 
that residuals would be approximately normal, thus allow¬ 
ing the robust geostatistical approach to be applied. How¬ 
ever, if all that was required was predictions of the SOC 
stock through a BRT approach, then it may be better to 
model the raw SOC stock data directly. 

4.4- Relevance of the models for SOC mapping 

Models comparisons enable one to come up with rec¬ 
ommendations regarding the best models for assessing a 
specific question. Of course, the quality of the models 
should be assessed using several criteria as the question of 
interest is asked within a specific context (data availabil¬ 
ity, nature of the considered systems, available statistical 
and modelling knowledge, computing cost). Several com¬ 
parison criteria may be defined : the Several comparison 
criteria may be defined : the technical knowledge (Know- 
Q) and the pedological knowledge (Know-P) needed for 
fitting, validating and applying models (Grunwald, 20091. 


We may add a criterion related to the nature of the re¬ 
quired datasets, again, for fitting, validating and applying 
models, and another one related to the performance of the 
models, assessed through validation procedures. Although 
other criteria might be defined, those might be considered 
as the main ones for predictive models. The best models 
would be those which, given the available Know-Q, Know- 
P and the datasets, yield the best performance. 

Several studies of SOC mapping include model com¬ 
parison in order to provide the best performing model and 
advices regarding which model should be used in a specific 
context. Comparing the results of these different stud¬ 
ies is not straightforward since the pedological contexts 
change from one study to another. In studies based on 
the application of geostatistics, model comparison is usu¬ 
ally carried out by comparing simple geostatistical models 
with more advanced approaches designed to incorporate 


The conclusion is consistently that including variables rep¬ 
resenting SOC drivers in geostatistical models improves 


model performance (for instance see 

Kempen et al. 

2011 

Vasques et al. 

2010 Ungar et al. 

2010). The cost of such 


an improvement is that it leads to an increase of the Know- 
P and the Know-Q. On one hand, such models might in¬ 
volve a great amount of technicality. On the other hand, 
the availability at observational sites of information re¬ 
garding the included drivers is then also required for the 
fit, the validation and later on the prediction. 

Fewer studies considered the question the other way 
around by including geostatistics in regression-based scor- 
pan models, such as the BRT models considered here or 
by comparing regression models to regression-kriging mod¬ 
els. On a 187,693 km^ area, Zhao and Shi ( 2010| showed 
that simple regression trees (RT) exhibited the best perfor¬ 
mance when compared to regression kriging and artificial 
neural network-kriging, among other methods. They con¬ 
cluded that their predictive models mostly rely on their 
ability to integrate secondary information into spatial pre¬ 
diction. In our case, the conclusions are contrasted. The 
LUg model applied a robust geostatistical approach to the 
residuals of the simplest BRT model (the LU model, which 
included land use and clay content as the only fixed effects, 
among the most important SOC drivers at the national 
scale of France, Martin et al., 2011| . This approach exhib¬ 
ited comparable but lower performance, in terms of R^, 
RPIQ, and RMSPE compared to the more complex re¬ 
gression models (L and F) processing all the available an¬ 
cillary data. Therefore, we conclude that adding a spatial 
component to a simple regression model can give similar 
improvements to adding more predictors to the model. 

Unbiased predictions might be achieved either by BRT 


modelling on the original scale (as shown by Martin et al. 


2011) or by BRT modelling of the log-transformed re¬ 


covariate data (e.g. cokriging, McBratney and Webster 


sponse and applying a geostatistical treatment. When it 
comes to mapping, one may wonder if preserving the mean 
of SOC stock distributions is more important than pre¬ 
serving the median. The mean might be more imporant 
in order to report total SOC stocks at the national scale, 
but preserving the median might result in more realistic 
maps. It is essentially a modelling choice, as to whether 
mean-unbiasedness or median-unbiasedness is required. 

4-5. Further recommendations for SOC mapping at the 
national seale 

Our best model (the Eg model) only explained 36% of 
the SOC variation. It is possible that local kriging meth¬ 
ods, rather than the global kriging applied here, could lead 
to improved predictions in some areas, although the choice 
of appropriate local neighbourhood sizes then provides an 
additional issue. Other regression models could be tested, 
such as support vector machines (SVM), random forests 
(]Hastie et al. 20011 or the Cubist modelling approach {e.g. 


12 






















































Bui et al. 20091. These models could result in different 


residual distributions but in our opinion, the consequences 
on the performance of their spatial couterpart are likely 
to be limited. Some of them, such as SVM require more 
technical knowledge, thus increasing the Know-Q factor, 
compared to the BRT models proposed here, for which 


efficient working guides have been proposed (Elith et al. 


2008). Grunwald (2009) stated that the future improve¬ 


ments in the prediction of soil properties does not rely on 
more sophisticated quantitative methods, but rather on 
gathering more useful and higher quality data. Choosing 
between gathering more data or improving the modelling 
is indeed the choice modellers are facing when attempt¬ 
ing to improve SOC maps. We show here that the choice 
might not be as straightforward as stated by [Grunwald 
(2009) : at the national scale, even a simple model based 


solely on landuse and clay, when complemented by geo¬ 
statistics, performed comparably to a model where all the 
available ancillary data was included (for France, at the 
time of the study, these were land use, soil, climate and 
npp maps). Therefore, for a country where only landuse 
and clay maps were available, the most efficient way to 
improve predictions in the short term would certainly be 
to consider geostatistical modelling of residuals (be. im¬ 
proving the modelling, rather than gathering new ancillary 
data). Furthermore, other datasets, on the same extent 
(be. national extent) but with different resolution might 
be more suited to geostatistics. Here, the 16xl6km^ does 
not allow for modelling spatial autocorrelations occurring 
at small scales. Many studies have demonstrated such an 
autocorrelation when more ’’local” neighbourhoods can be 


studied ( 

Mabit and Bernard 

2010 

Don et al. 

2007 

Hossi 

jet ai.||20091 Yun-Qiang et al. 

2009 

Spielvogel et al.l 

2009 


with an extent i50km^ andjMishra et al.l|2009|at coarser 


extents and using a non-systematic sampling scheme). 

On the other hand, adding spatial terms to the most com¬ 
plex models only increased know-Q to our data-analysis 
scheme. More generally, the higher the uncertainty in 
maps of ancillary variables, the more likely it is that mod¬ 
els based solely on SOC spatial dependency or including 
only few good quality (in terms of data uncertainty) pre¬ 
dictors will outperform complex models using many ancil¬ 
lary variables. 

For France, other SOC predictors could be included in 
our regression models, and result in signihcant improve¬ 


ments. There are different possibilities (Martin et al. 


2011), but of course, these improvements depend on the 


increase in Know-P and data-collection one is willing to 
consider. Having a better soil map is obviously a very 
good candidate. This is exemplified here by the drop in 
the F model performance between the present study and 


the work by Martin et al. (2011). 


It is also worth noting that an advantage of using mul¬ 
tiple regression tools, such as the BRT models, comes from 
studying the fitted relationships between the response and 
the predictors, which may in turn bring additional knowl¬ 


edge. For instance, BRT was used in Martin et al. (2011) 


to rank the effects of the SOC stocks driving factors. 


5. Conclusion 


Based on the results of the present study, and others 
found in the literature, we formulate the following rec¬ 
ommendations. These recommendations apply for France 
but the French diversity in terms of pedoclimatic condi¬ 
tions might make these recommendations valid for other 
countries as well. If the information contained in the re¬ 
lationships between the ancillary variables and the SOC 
stocks are strong enough, then standalone robust regres¬ 
sion models such as BRT - which enable one to take into 
account in a flexible way non-linearities and interactions 
exhibited by the datasets - could prove sufficient for SOC 
mapping at the national scale. This conclusion is valid 


provided that i) care is exercised in model htting (Elith 
et al. 2008) and validating, ii) the dataset does not allow 
for modelling local spatial autocorrelations, as it is the 
case for many national systematic sampling schemes, and 
iii) the ancillary data are of suitable quality. However, the 
results in this paper demonstrate that it should also be 
prudent to use geostatistical methods to check for spatial 
autocorrelation in the BRT residuals. If found, which was 
the case for the simpler of our BRT models (which failed to 
capture all the important SOC drivers at a national scale), 
then a kriging approach applied to the BRT residuals can 
provide a more accurate map of SOC stocks. Furthermore, 
even if the spatial correlation fails to significantly improve 
SOC predictions globally, it is possible that by mapping 
the BRT model residuals we can highlight regional errors 
in the BRT model, and thereby provide information to 
guide research into further SOC model development. 
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